OLAP over Imprecise Data with Domain Constraints

Authors

  • Douglas Burdick
  • AnHai Doan
  • Raghu Ramakrishnan
  • Shivakumar Vaithyanathan
Abstract

Several recent works have focused on OLAP over imprecise data, where each fact can be a region, instead of a point, in a multidimensional space. They have provided a multiple-world semantics for such data and developed efficient solutions to answer OLAP aggregation queries over the imprecise facts. These solutions, however, assume that the imprecise facts can be interpreted independently of one another, a key assumption that is often violated in practice. Indeed, imprecise facts in real-world applications are often correlated, and such correlations can be captured as domain integrity constraints (e.g., repairs with the same customer names and models took place in the same city, or a text span can refer to a person or a city, but not both). In this paper we provide a solution to answer OLAP aggregation queries over imprecise data in the presence of such domain constraints. We first describe a relatively simple yet powerful constraint language, and define what it means to take such constraints into account in query answering. Next, we prove that OLAP queries can be answered efficiently given a database D* of fact marginals. We then exploit the regularities in the constraint space (captured in a constraint hypergraph) and the fact space to efficiently construct D*. Extensive experiments over real-world and synthetic data demonstrate the effectiveness of our approach.
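The idea of answering queries from fact marginals can be illustrated with a minimal sketch. It is not the paper's algorithm: it enumerates possible worlds by brute force, assumes all constraint-satisfying worlds are equally likely, and uses hypothetical toy facts and function names (`FACTS`, `satisfies`, `fact_marginals`, `expected_count`). The constraint is the abstract's own example, that repairs with the same customer and model took place in the same city.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical toy facts: (fact id, customer, model, candidate cities).
# A fact with more than one candidate city is imprecise.
FACTS = [
    ("f1", "Ann", "Civic", ("Madison", "Chicago")),
    ("f2", "Ann", "Civic", ("Madison",)),
    ("f3", "Bob", "Camry", ("Madison", "Chicago")),
]

def satisfies(world):
    """Check the example constraint: same customer and model implies same city."""
    seen = {}
    for (_, cust, model, _), city in zip(FACTS, world):
        if seen.setdefault((cust, model), city) != city:
            return False
    return True

def fact_marginals():
    """Per-fact distribution over cities, assuming valid worlds are equally likely."""
    worlds = [w for w in product(*(f[3] for f in FACTS)) if satisfies(w)]
    marg = defaultdict(lambda: defaultdict(Fraction))
    for w in worlds:
        for (fid, *_), city in zip(FACTS, w):
            marg[fid][city] += Fraction(1, len(worlds))
    return marg

def expected_count(marg, city):
    """Expected COUNT for a city, computed from the marginals alone."""
    return sum(m.get(city, Fraction(0)) for m in marg.values())

marg = fact_marginals()
print(expected_count(marg, "Madison"))
```

In this toy instance the constraint forces f1 into Madison because f2 is known to be there, so the two facts are not interpreted independently. The expected count needs only the per-fact marginals because expectation is linear; brute-force enumeration is exponential in the number of imprecise facts and is used here only to keep the sketch short.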

Similar resources

Efficient Algorithms for Allocation Policies

Recent work [2] proposed extending the OLAP data model to represent data ambiguity. Specifically, one form of ambiguity that work addressed arose from relaxing the assumption that all dimension attributes in a fact are assigned leaf-level values from the underlying domain hierarchy. Such data was referred to as imprecise. Allocation was proposed by [2] as a mechanism to deal with imprecision. I...

Imprecise Data and Knowledge Based OLAP

In this paper we present our approach for extending the OLAP model to include treatment of value uncertainty as part of a multidimensional model inhabited by flexible data and non-rigid hierarchical structures of organisation. A new multidimensional cubic model, named the IF-Cube, is introduced; it can operate over data with imprecision either in the facts or in the dimensional hierar...

Aspects of Data Modeling and Query Processing for Complex Multidimensional Data

This thesis is about data modeling and query processing for complex multidimensional data. Multidimensional data has become the subject of much attention in both academia and industry in recent years, fueled by the popularity of data warehousing and On-Line Analytical Processing (OLAP) applications. One application area where complex multidimensional data is common is within medical informatics...

Answering Imprecise Queries over Autonomous Databases

Current approaches for answering queries with imprecise constraints require users to provide distance metrics and importance measures for attributes of interest, metrics that are hard to elicit from lay users. Moreover, they assume the ability to modify the architecture of the underlying database. Given that most Web databases are autonomous and may have users with limited expertise over the asso...

Supporting Queries with Imprecise Constraints

In this paper, we motivate the need for, and the challenges involved in, supporting imprecise queries over Web databases. Then we briefly explain our solution, AIMQ, a domain-independent approach for answering imprecise queries that automatically learns the query relaxation order by using approximate functional dependencies. We also describe our approach for learning similarity between values of categoric...

Publication date: 2007